Missing data imputation for paired stream and air temperature sensor data
Authors
Abstract
Correspondence: Eric Smith, Department of Statistics, Virginia Tech, 406A Hutcheson Hall, Blacksburg, VA 24061, U.S.A. Email: [email protected]
Stream water temperature is an important factor in determining the impact of climate change on hydrologic systems. Inexpensive temperature recorders make near-continuous monitoring of air and stream temperatures possible over large spatial scales. However, water temperature data are commonly missing because equipment fails or is lost. Missing data make it difficult to model the relationship between air and stream water temperatures. They also pose challenges for downstream analyses, for example, clustering streams by how their water temperature responds to change. In this work, we propose a novel spatial–temporal varying coefficient model to impute missing water temperatures. Modeling the relationship between air and water temperature over time and space increases the effectiveness of the imputation. We develop a parameter estimation method that utilizes the temporal covariation in the relationship, borrows strength from neighboring stream sites, and is useful for imputing sequences of missing data. A simulation study examines the performance of the proposed method in comparison with several existing imputation methods. The proposed method is then applied to 156 streams with missing water temperatures, which are clustered into groups with meaningful interpretations.
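To make the idea of a time-varying air-to-water relationship concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes a single site, a complete air temperature series, and a Gaussian kernel in time. The function name `impute_water_temp` and the `bandwidth` parameter are hypothetical, and the spatial component (borrowing strength from neighboring sites) is omitted entirely.

```python
import numpy as np

def impute_water_temp(air, water, bandwidth=10.0):
    """Fill NaNs in `water` with a kernel-weighted local regression on `air`.

    Illustrative only: a temporal varying-coefficient fit at one site,
    water[t] ~ b0(t) + b1(t) * air[t], with no spatial borrowing.
    Assumes `air` has no missing values.
    """
    air = np.asarray(air, dtype=float)
    water = np.asarray(water, dtype=float)
    t = np.arange(len(water), dtype=float)
    observed = ~np.isnan(water)
    filled = water.copy()

    # Design matrix for the observed time points: intercept and air temperature.
    X_obs = np.column_stack([np.ones(observed.sum()), air[observed]])
    y_obs = water[observed]

    for t0 in np.flatnonzero(~observed):
        # Gaussian weights: observations close in time to t0 count more.
        w = np.exp(-0.5 * ((t[observed] - t0) / bandwidth) ** 2)
        sw = np.sqrt(w)
        # Weighted least squares, solved as ordinary least squares on rescaled rows.
        beta, *_ = np.linalg.lstsq(X_obs * sw[:, None], y_obs * sw, rcond=None)
        # Predict the missing water temperature from the local coefficients.
        filled[t0] = beta[0] + beta[1] * air[t0]
    return filled
```

Because the coefficients are re-estimated at every missing time point, a run of consecutive missing values can still be filled from the air series, which is the situation the abstract highlights. A smaller `bandwidth` lets the coefficients change faster over time at the cost of noisier estimates.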
Similar resources
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons
Background: Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. The presence of missing data challenges the practice of model development. Several studies suggested that the performance of imputation methods is acceptable when the missing rate is moderate. One of the issues which was of less concern...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge that affects longitudinal studies is the problem of missing data. Missing data may result in the loss of part of the information, which reduces the accuracy of the estimator and makes the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in longitudinal research and to consider thi...
Application of multiple imputation in medical and epidemiological research
Missing data, which occur for different reasons, are an unavoidable problem in epidemiological studies. The problem is quite widespread and is therefore considered a challenge in research design and data analysis by many methodologists. Complete case analysis is often used in studies with missing data; however, this approach may result in inaccurate estimates and inferences due to bias associated...
Analysis of missing observations in a study of the effect of different doses of vitamin D supplementation on insulin resistance during pregnancy
Introduction: The aim of this study was to impute missing data and to compare the effect of different doses of vitamin D supplementation on insulin resistance during pregnancy. Methods: A clinical trial study was done on 104 women with diabetes and gestational age less than 12 weeks between 1391 and...